Unlabeled data: Now it helps, now it doesn't

Authors

  • Aarti Singh
  • Robert D. Nowak
  • Xiaojin Zhu
Abstract

Empirical evidence shows that in favorable situations semi-supervised learning (SSL) algorithms can capitalize on the abundance of unlabeled training data to improve the performance of a learning task, in the sense that fewer labeled training data are needed to achieve a target error bound. However, in other situations unlabeled data do not seem to help. Recent attempts at theoretically characterizing SSL gains provide only partial and sometimes apparently conflicting explanations of whether, and to what extent, unlabeled data can help. In this paper, we attempt to bridge the gap between the practice and theory of semi-supervised learning. We develop a finite sample analysis that characterizes the value of unlabeled data and quantifies the performance improvement of SSL compared to supervised learning. We show that there are large classes of problems for which SSL can significantly outperform supervised learning, in finite sample regimes and sometimes also in terms of error convergence rates.
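
As a rough illustration of the favorable case described in the abstract, the following toy sketch compares a supervised learner with a semi-supervised one on a one-dimensional problem. It is not the paper's analysis or algorithm. The two component supports, the two labeled points, the sample sizes, and all function names (`sample`, `nearest_neighbor`, `cluster_then_label`, `error_rate`) are assumptions chosen for illustration. The semi-supervised learner uses unlabeled points only to locate the gap between the components, then labels each side from the few labeled points.

```python
import random

# Hypothetical toy setup (not from the paper): label 0 is supported on
# [0.0, 0.7], label 1 on [0.8, 1.0]; the gap between them acts as the margin.
LEFT = (0.0, 0.7)
RIGHT = (0.8, 1.0)


def sample(n, rng):
    """Draw n (x, y) pairs, picking each component with probability 1/2."""
    points = []
    for _ in range(n):
        if rng.random() < 0.5:
            points.append((rng.uniform(*LEFT), 0))
        else:
            points.append((rng.uniform(*RIGHT), 1))
    return points


def nearest_neighbor(labeled):
    """Supervised baseline: predict the label of the closest labeled point."""
    def predict(x):
        return min(labeled, key=lambda p: abs(p[0] - x))[1]
    return predict


def cluster_then_label(labeled, unlabeled_xs):
    """Semi-supervised sketch: split at the widest gap in the unlabeled data,
    then give each side the majority label of the labeled points on it."""
    xs = sorted(unlabeled_xs)
    k = max(range(len(xs) - 1), key=lambda i: xs[i + 1] - xs[i])
    threshold = (xs[k] + xs[k + 1]) / 2.0
    fallback = nearest_neighbor(labeled)

    def side_label(right_side):
        votes = [y for x, y in labeled if (x > threshold) == right_side]
        if not votes:
            return None  # no labeled point on this side
        return 1 if 2 * sum(votes) >= len(votes) else 0

    left_label, right_label = side_label(False), side_label(True)

    def predict(x):
        label = right_label if x > threshold else left_label
        return fallback(x) if label is None else label
    return predict


def error_rate(predict, test):
    """Fraction of test points whose predicted label is wrong."""
    return sum(predict(x) != y for x, y in test) / len(test)


rng = random.Random(0)
labeled = [(0.1, 0), (0.9, 1)]            # two labeled points, chosen by hand
unlabeled_xs = [x for x, _ in sample(2000, rng)]
test = sample(5000, rng)

sup_err = error_rate(nearest_neighbor(labeled), test)
ssl_err = error_rate(cluster_then_label(labeled, unlabeled_xs), test)
print(f"supervised error: {sup_err:.3f}, semi-supervised error: {ssl_err:.3f}")
```

In this setup the nearest-neighbor rule places its boundary at 0.5, so it misclassifies the part of the left component lying in (0.5, 0.7], while the unlabeled points reveal the true gap. If the gap were narrower than the typical spacing of the unlabeled points, the widest-gap step would no longer find it reliably, which corresponds to the unfavorable case the abstract mentions.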

Similar resources

ERRATA for NIPS 2008 paper "Unlabeled data: Now it helps, now it doesn't"

1) Definition of margin γ. We replace the ‖ · ‖∞ norm in the definition of the margin with inf_{x∈X} | · |. The corrected definition is given below. The collection P_XY is indexed by a margin parameter γ, which denotes the minimum width of a decision set or separation between the component support sets C_k. The margin γ is assigned a positive sign if there is no overlap between components, other...

P-V-L Deep: A Big Data Analytics Solution for Now-casting in Monetary Policy

The development of new technologies has confronted the entire domain of science and industry with issues of big data's scalability as well as its integration with the purpose of forecasting analytics in its life cycle. In predictive analytics, the forecast of near-future and recent past - or in other words, the now-casting - is the continuous study of real-time events and constantly updated whe...

Integrated ERP System for Improving the Functional efficiency of the organization by Customized Architecture

An ERP is a kind of package that consists of a front end and a back end built on a DBMS, like a collection of DBMSs. You can create a DBMS to manage one aspect of your business. For example, a publishing house has a database of books that keeps information about books such as Author Name, Title, Translator Name, etc. But this database app only helps enter books' data and search them. It doesn't help them, for exa...

Big Data Analytics and Now-casting: A Comprehensive Model for Eventuality of Forecasting and Predictive Policies of Policy-making Institutions

The ability of now-casting and eventuality is the most crucial and vital achievement of big data analytics in the area of policy-making. To recognize the trends and to render a real image of the current condition and alarming immediate indicators, the significance and the specific positions of big data in policy-making are undeniable. Moreover, the requirement for policy-making institutions to ...

I-4: Is It Time Now to Cancel Fresh Embryo Transfer

Implantation of the embryo depends on the quality of the embryo and the receptivity of the endometrium. While the factors affecting embryo quality were optimized, the implantation rates did not reach the desirable levels. This shifts the blame to endometrial receptivity, which has already been affected by supra-physiological levels of hormone, namely estradiol, in controlled ovarian hypersti...

Publication date: 2008